Overview
Dataset statistics
| | Train data report | Test data report |
|---|---|---|
| Number of variables | 12 | 11 |
| Number of observations | 891 | 418 |
| Missing cells | 866 | 414 |
| Missing cells (%) | 8.1% | 9.0% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 83.7 KiB | 36.1 KiB |
| Average record size in memory | 96.1 B | 88.3 B |
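The missing-cell percentages above agree with dividing the number of missing cells by the total number of cells (observations times variables). The following is a minimal sketch of that arithmetic; the function name is hypothetical and the formula is inferred from the reported figures rather than stated by the report.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of empty cells as a percentage of all cells (rows x columns)."""
    return 100.0 * missing_cells / (n_rows * n_cols)


# Counts copied from the Dataset statistics table.
train_pct = missing_cell_pct(866, n_rows=891, n_cols=12)
test_pct = missing_cell_pct(414, n_rows=418, n_cols=11)

print(f"train: {train_pct:.1f}%, test: {test_pct:.1f}%")
```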
Variable types
| | Train data report | Test data report |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 3 |
| Text | 3 | 3 |
Alerts
| Train data report | Test data report | Alert type |
|---|---|---|
| Sex is highly overall correlated with Survived | Alert not present in this dataset | High correlation |
| Survived is highly overall correlated with Sex | Alert not present in this dataset | High correlation |
| Age has 177 (19.9%) missing values | Age has 86 (20.6%) missing values | Missing |
| Cabin has 687 (77.1%) missing values | Cabin has 327 (78.2%) missing values | Missing |
| PassengerId is uniformly distributed | PassengerId is uniformly distributed | Uniform |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 608 (68.2%) zeros | SibSp has 283 (67.7%) zeros | Zeros |
| Parch has 678 (76.1%) zeros | Parch has 324 (77.5%) zeros | Zeros |
| Fare has 15 (1.7%) zeros | Alert not present in this dataset | Zeros |
Reproduction
| | Train data report | Test data report |
|---|---|---|
| Analysis started | 2025-02-27 19:42:12.830895 | 2025-02-27 19:42:17.456738 |
| Analysis finished | 2025-02-27 19:42:14.673000 | 2025-02-27 19:42:19.513129 |
| Duration | 1.84 seconds | 2.06 seconds |
| Software version | ydata-profiling v4.12.2 | ydata-profiling v4.12.2 |
| Download configuration | config.json | config.json |
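A side-by-side report of this kind can be produced with the comparison feature of ydata-profiling. The sketch below is one way to do it, not necessarily how this report was generated: the file names `train.csv`, `test.csv` and `comparison.html` are assumptions, and the imports are placed inside the function so the snippet loads even where pandas and ydata-profiling are not installed.

```python
def build_comparison_report(train_csv="train.csv", test_csv="test.csv",
                            out_html="comparison.html"):
    """Profile two CSV files and write a side-by-side comparison report."""
    # Imported lazily because both packages are optional, heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The titles become the column labels of the comparison.
    train_report = ProfileReport(pd.read_csv(train_csv), title="Train data report")
    test_report = ProfileReport(pd.read_csv(test_csv), title="Test data report")

    # compare() returns a new report with both datasets side by side.
    comparison = train_report.compare(test_report)
    comparison.to_file(out_html)
    return out_html
```

Calling `build_comparison_report()` with the default arguments expects both CSV files in the working directory.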
Variables
PassengerId
Real number (ℝ)
| | Train data report | Test data report |
|---|---|---|
| Distinct | 891 | 418 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 446 | 1100.5 |
| Minimum | 1 | 892 |
| Maximum | 891 | 1309 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Quantile statistics
| | Train data report | Test data report |
|---|---|---|
| Minimum | 1 | 892 |
| 5-th percentile | 45.5 | 912.85 |
| Q1 | 223.5 | 996.25 |
| median | 446 | 1100.5 |
| Q3 | 668.5 | 1204.75 |
| 95-th percentile | 846.5 | 1288.15 |
| Maximum | 891 | 1309 |
| Range | 890 | 417 |
| Interquartile range (IQR) | 445 | 208.5 |
Descriptive statistics
| | Train data report | Test data report |
|---|---|---|
| Standard deviation | 257.35384 | 120.81046 |
| Coefficient of variation (CV) | 0.57702655 | 0.10977779 |
| Kurtosis | -1.2 | -1.2 |
| Mean | 446 | 1100.5 |
| Median Absolute Deviation (MAD) | 223 | 104.5 |
| Skewness | 0 | 0 |
| Sum | 397386 | 460009 |
| Variance | 66231 | 14595.167 |
| Monotonicity | Strictly increasing | Strictly increasing |
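Because PassengerId is unique and strictly increasing from 1 to 891 in the train data, its summary statistics can be recomputed without the dataset. The sketch below assumes the column is exactly the integers 1..891 and that the reported variance uses the sample (n - 1) denominator; both assumptions are inferred from the figures above.

```python
import math
import statistics

# Assumption: the train PassengerId column is exactly the integers 1..891.
ids = list(range(1, 892))

mean = statistics.mean(ids)
median = statistics.median(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(x - median) for x in ids)

print(sum(ids), mean, variance, round(std, 5), round(cv, 8), mad)
```

The same check applies to the test column by using `range(892, 1310)`.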
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 599 | 1 | 0.1% |
| 588 | 1 | 0.1% |
| 589 | 1 | 0.1% |
| 590 | 1 | 0.1% |
| 591 | 1 | 0.1% |
| 592 | 1 | 0.1% |
| 593 | 1 | 0.1% |
| 594 | 1 | 0.1% |
| 595 | 1 | 0.1% |
| Other values (881) | 881 |
| Value | Count | Frequency (%) |
| 892 | 1 | 0.2% |
| 1205 | 1 | 0.2% |
| 1177 | 1 | 0.2% |
| 1176 | 1 | 0.2% |
| 1175 | 1 | 0.2% |
| 1174 | 1 | 0.2% |
| 1173 | 1 | 0.2% |
| 1172 | 1 | 0.2% |
| 1171 | 1 | 0.2% |
| 1170 | 1 | 0.2% |
| Other values (408) | 408 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
| Value | Count | Frequency (%) |
| 892 | 1 | |
| 893 | 1 | |
| 894 | 1 | |
| 895 | 1 | |
| 896 | 1 | |
| 897 | 1 | |
| 898 | 1 | |
| 899 | 1 | |
| 900 | 1 | |
| 901 | 1 |
Survived
Categorical (train data report only)
Length
| | Train data report |
|---|---|
| Max length | 1 |
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| | Train data report |
|---|---|
| Unique | 0 |
| Unique (%) | 0.0% |
Sample
| | Train data report |
|---|---|
| 1st row | 0 |
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 549 | |
| 1 | 342 | |
Length
[Figure: Histogram of lengths of the category]
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 549 | |
| 1 | 342 | |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 891 | |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 891 | |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 891 | |
Pclass
Categorical
| | Train data report | Test data report |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.3% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | 3 | 3 |
| 2nd row | 1 | 3 |
| 3rd row | 3 | 2 |
| 4th row | 1 | 3 |
| 5th row | 3 | 3 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 491 | |
| 1 | 216 | |
| 2 | 184 | 20.7% |
| Value | Count | Frequency (%) |
| 3 | 218 | |
| 1 | 107 | |
| 2 | 93 |
Length
[Figure: Histogram of lengths of the category]
Common Values (Plot)
[Figure: Common values, Train data report and Test data report]
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 491 | |
| 1 | 216 | |
| 2 | 184 | 20.7% |
| Value | Count | Frequency (%) |
| 3 | 218 | |
| 1 | 107 | |
| 2 | 93 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 891 |
| Value | Count | Frequency (%) |
| (unknown) | 418 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 891 |
| Value | Count | Frequency (%) |
| (unknown) | 418 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 891 |
| Value | Count | Frequency (%) |
| (unknown) | 418 |
Name
Text
| | Train data report | Test data report |
|---|---|---|
| Distinct | 891 | 418 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 82 | 63 |
| Median length | 52 | 51 |
| Mean length | 26.965208 | 27.483254 |
| Min length | 12 | 13 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 891 | 418 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | Braund, Mr. Owen Harris | Kelly, Mr. James |
| 2nd row | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | Wilkes, Mrs. James (Ellen Needs) |
| 3rd row | Heikkinen, Miss. Laina | Myles, Mr. Thomas Francis |
| 4th row | Futrelle, Mrs. Jacques Heath (Lily May Peel) | Wirz, Mr. Albert |
| 5th row | Allen, Mr. William Henry | Hirvonen, Mrs. Alexander (Helga E Lindqvist) |
| Value | Count | Frequency (%) |
| mr | 521 | 14.4% |
| miss | 182 | 5.0% |
| mrs | 129 | 3.6% |
| william | 64 | 1.8% |
| john | 44 | 1.2% |
| master | 40 | 1.1% |
| henry | 35 | 1.0% |
| george | 24 | 0.7% |
| james | 24 | 0.7% |
| charles | 23 | 0.6% |
| Other values (1515) | 2538 |
| Value | Count | Frequency (%) |
| mr | 242 | 14.0% |
| miss | 78 | 4.5% |
| mrs | 72 | 4.2% |
| john | 28 | 1.6% |
| william | 23 | 1.3% |
| master | 21 | 1.2% |
| charles | 16 | 0.9% |
| joseph | 15 | 0.9% |
| james | 14 | 0.8% |
| henry | 14 | 0.8% |
| Other values (825) | 1202 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2735 | 11.4% |
| r | 1958 | 8.1% |
| e | 1703 | 7.1% |
| a | 1657 | 6.9% |
| i | 1325 | 5.5% |
| n | 1304 | 5.4% |
| s | 1297 | 5.4% |
| M | 1128 | 4.7% |
| l | 1067 | 4.4% |
| o | 1008 | 4.2% |
| Other values (50) | 8844 |
| Value | Count | Frequency (%) |
| (space) | 1309 | 11.4% |
| r | 971 | 8.5% |
| e | 822 | 7.2% |
| a | 786 | 6.8% |
| s | 628 | 5.5% |
| i | 621 | 5.4% |
| n | 596 | 5.2% |
| l | 526 | 4.6% |
| M | 515 | 4.5% |
| o | 467 | 4.1% |
| Other values (48) | 4247 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 24026 |
| Value | Count | Frequency (%) |
| (unknown) | 11488 |
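The character totals above tie back to the Length table: dividing the total character count by the number of non-missing values gives the reported mean length. The sketch below checks this for Name; the helper name is hypothetical and the relation is inferred from the numbers, not stated by the report.

```python
def mean_length(total_chars: int, n_values: int) -> float:
    """Average string length: total characters over non-missing values."""
    return total_chars / n_values


# Name has no missing values, so n_values equals the number of observations.
train_mean = mean_length(24026, 891)
test_mean = mean_length(11488, 418)

print(f"train: {train_mean:.6f}, test: {test_mean:.6f}")
```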
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 24026 |
| Value | Count | Frequency (%) |
| (unknown) | 11488 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 24026 |
| Value | Count | Frequency (%) |
| (unknown) | 11488 |
Sex
Categorical
| | Train data report | Test data report |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.2% | 0.5% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.704826 | 4.7272727 |
| Min length | 4 | 4 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | male | male |
| 2nd row | female | female |
| 3rd row | female | male |
| 4th row | female | male |
| 5th row | male | female |
Common Values
| Value | Count | Frequency (%) |
| male | 577 | |
| female | 314 |
| Value | Count | Frequency (%) |
| male | 266 | |
| female | 152 |
Length
[Figure: Histogram of lengths of the category]
Common Values (Plot)
[Figure: Common values, Train data report and Test data report]
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1205 | |
| m | 891 | |
| a | 891 | |
| l | 891 | |
| f | 314 | 7.5% |
| Value | Count | Frequency (%) |
| e | 570 | |
| m | 418 | |
| a | 418 | |
| l | 418 | |
| f | 152 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 4192 |
| Value | Count | Frequency (%) |
| (unknown) | 1976 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 4192 |
| Value | Count | Frequency (%) |
| (unknown) | 1976 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 4192 |
| Value | Count | Frequency (%) |
| (unknown) | 1976 |
Age
Real number (ℝ)
| | Train data report | Test data report |
|---|---|---|
| Distinct | 88 | 79 |
| Distinct (%) | 12.3% | 23.8% |
| Missing | 177 | 86 |
| Missing (%) | 19.9% | 20.6% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.699118 | 30.27259 |
| Minimum | 0.42 | 0.17 |
| Maximum | 80 | 76 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
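The Distinct (%) figures for Age only match when the denominator excludes missing values: 88 distinct ages over 891 rows would be about 9.9%, not 12.3%. The sketch below reproduces the reported percentages under that assumption; the function name is hypothetical and the formula is inferred from the table.

```python
def distinct_pct(n_distinct: int, n_rows: int, n_missing: int) -> float:
    """Distinct values as a percentage of the non-missing observations."""
    return 100.0 * n_distinct / (n_rows - n_missing)


train_pct = distinct_pct(88, n_rows=891, n_missing=177)  # 88 / 714
test_pct = distinct_pct(79, n_rows=418, n_missing=86)    # 79 / 332

print(f"train: {train_pct:.1f}%, test: {test_pct:.1f}%")
```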
Quantile statistics
| | Train data report | Test data report |
|---|---|---|
| Minimum | 0.42 | 0.17 |
| 5-th percentile | 4 | 8 |
| Q1 | 20.125 | 21 |
| median | 28 | 27 |
| Q3 | 38 | 39 |
| 95-th percentile | 56 | 57 |
| Maximum | 80 | 76 |
| Range | 79.58 | 75.83 |
| Interquartile range (IQR) | 17.875 | 18 |
Descriptive statistics
| | Train data report | Test data report |
|---|---|---|
| Standard deviation | 14.526497 | 14.181209 |
| Coefficient of variation (CV) | 0.48912219 | 0.46845047 |
| Kurtosis | 0.17827415 | 0.083783352 |
| Mean | 29.699118 | 30.27259 |
| Median Absolute Deviation (MAD) | 9 | 8 |
| Skewness | 0.38910778 | 0.45736129 |
| Sum | 21205.17 | 10050.5 |
| Variance | 211.01912 | 201.1067 |
| Monotonicity | Not monotonic | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 24 | 30 | 3.4% |
| 22 | 27 | 3.0% |
| 18 | 26 | 2.9% |
| 28 | 25 | 2.8% |
| 30 | 25 | 2.8% |
| 19 | 25 | 2.8% |
| 21 | 24 | 2.7% |
| 25 | 23 | 2.6% |
| 36 | 22 | 2.5% |
| 29 | 20 | 2.2% |
| Other values (78) | 467 | |
| (Missing) | 177 | 19.9% |
| Value | Count | Frequency (%) |
| 24 | 17 | 4.1% |
| 21 | 17 | 4.1% |
| 22 | 16 | 3.8% |
| 30 | 15 | 3.6% |
| 18 | 13 | 3.1% |
| 27 | 12 | 2.9% |
| 26 | 12 | 2.9% |
| 23 | 11 | 2.6% |
| 25 | 11 | 2.6% |
| 29 | 10 | 2.4% |
| Other values (69) | 198 | |
| (Missing) | 86 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.1% |
| 0.67 | 1 | 0.1% |
| 0.75 | 2 | 0.2% |
| 0.83 | 2 | 0.2% |
| 0.92 | 1 | 0.1% |
| 1 | 7 | |
| 2 | 10 | |
| 3 | 6 | |
| 4 | 10 | |
| 5 | 4 | 0.4% |
| Value | Count | Frequency (%) |
| 0.17 | 1 | 0.2% |
| 0.33 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 2 | |
| 3 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| 6 | 3 |
SibSp
Real number (ℝ)
| | Train data report | Test data report |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 0.8% | 1.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.52300786 | 0.44736842 |
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 608 | 283 |
| Zeros (%) | 68.2% | 67.7% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Quantile statistics
| | Train data report | Test data report |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 2 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Train data report | Test data report |
|---|---|---|
| Standard deviation | 1.1027434 | 0.89675956 |
| Coefficient of variation (CV) | 2.1084644 | 2.0045214 |
| Kurtosis | 17.88042 | 26.498712 |
| Mean | 0.52300786 | 0.44736842 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.6953517 | 4.1683366 |
| Sum | 466 | 187 |
| Variance | 1.2160431 | 0.80417771 |
| Monotonicity | Not monotonic | Not monotonic |
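Several rows of the descriptive statistics are linked by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks the SibSp figures against these identities; the function name and the tolerance are assumptions chosen for illustration.

```python
import math


def check_summary(n, total, std, reported_mean, reported_cv, reported_var,
                  tol=1e-6):
    """Return which reported statistics agree with the basic identities."""
    mean = total / n
    return {
        "mean": math.isclose(mean, reported_mean, abs_tol=tol),
        "cv": math.isclose(std / mean, reported_cv, abs_tol=tol),
        "variance": math.isclose(std ** 2, reported_var, abs_tol=tol),
    }


# Values copied from the SibSp tables (SibSp has no missing values).
train_ok = check_summary(891, 466, 1.1027434, 0.52300786, 2.1084644, 1.2160431)
test_ok = check_summary(418, 187, 0.89675956, 0.44736842, 2.0045214, 0.80417771)

print(train_ok, test_ok)
```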
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 0 | 608 | |
| 1 | 209 | 23.5% |
| 2 | 28 | 3.1% |
| 4 | 18 | 2.0% |
| 3 | 16 | 1.8% |
| 8 | 7 | 0.8% |
| 5 | 5 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 110 | 26.3% |
| 2 | 14 | 3.3% |
| 3 | 4 | 1.0% |
| 4 | 4 | 1.0% |
| 8 | 2 | 0.5% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 608 | |
| 1 | 209 | 23.5% |
| 2 | 28 | 3.1% |
| 3 | 16 | 1.8% |
| 4 | 18 | 2.0% |
| 5 | 5 | 0.6% |
| 8 | 7 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 110 | 26.3% |
| 2 | 14 | 3.3% |
| 3 | 4 | 1.0% |
| 4 | 4 | 1.0% |
| 5 | 1 | 0.2% |
| 8 | 2 | 0.5% |
Parch
Real number (ℝ)
| | Train data report | Test data report |
|---|---|---|
| Distinct | 7 | 8 |
| Distinct (%) | 0.8% | 1.9% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.38159371 | 0.3923445 |
| Minimum | 0 | 0 |
| Maximum | 6 | 9 |
| Zeros | 678 | 324 |
| Zeros (%) | 76.1% | 77.5% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Quantile statistics
| | Train data report | Test data report |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 9 |
| Range | 6 | 9 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Train data report | Test data report |
|---|---|---|
| Standard deviation | 0.80605722 | 0.98142888 |
| Coefficient of variation (CV) | 2.1123441 | 2.5014468 |
| Kurtosis | 9.7781252 | 31.412513 |
| Mean | 0.38159371 | 0.3923445 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.749117 | 4.6544617 |
| Sum | 340 | 164 |
| Variance | 0.64972824 | 0.96320264 |
| Monotonicity | Not monotonic | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 0 | 678 | |
| 1 | 118 | 13.2% |
| 2 | 80 | 9.0% |
| 5 | 5 | 0.6% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.4% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 324 | |
| 1 | 52 | 12.4% |
| 2 | 33 | 7.9% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.5% |
| 9 | 2 | 0.5% |
| 6 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 678 | |
| 1 | 118 | 13.2% |
| 2 | 80 | 9.0% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.4% |
| 5 | 5 | 0.6% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 324 | |
| 1 | 52 | 12.4% |
| 2 | 33 | 7.9% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.5% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 9 | 2 | 0.5% |
Ticket
Text
| | Train data report | Test data report |
|---|---|---|
| Distinct | 681 | 363 |
| Distinct (%) | 76.4% | 86.8% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7508418 | 6.8755981 |
| Min length | 3 | 3 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 547 | 321 |
| Unique (%) | 61.4% | 76.8% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | A/5 21171 | 330911 |
| 2nd row | PC 17599 | 363272 |
| 3rd row | STON/O2. 3101282 | 240276 |
| 4th row | 113803 | 315154 |
| 5th row | 373450 | 3101298 |
| Value | Count | Frequency (%) |
| pc | 60 | 5.3% |
| c.a | 27 | 2.4% |
| a/5 | 17 | 1.5% |
| ca | 14 | 1.2% |
| ston/o | 12 | 1.1% |
| 2 | 12 | 1.1% |
| sc/paris | 9 | 0.8% |
| w./c | 9 | 0.8% |
| soton/o.q | 8 | 0.7% |
| 347082 | 7 | 0.6% |
| Other values (709) | 955 |
| Value | Count | Frequency (%) |
| pc | 32 | 5.9% |
| c.a | 19 | 3.5% |
| ca | 8 | 1.5% |
| soton/o.q | 8 | 1.5% |
| sc/paris | 7 | 1.3% |
| 17608 | 5 | 0.9% |
| 2 | 5 | 0.9% |
| a/5 | 5 | 0.9% |
| w./c | 5 | 0.9% |
| f.c.c | 4 | 0.7% |
| Other values (383) | 445 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 746 | |
| 1 | 689 | |
| 2 | 594 | |
| 7 | 490 | |
| 4 | 464 | 7.7% |
| 6 | 422 | 7.0% |
| 0 | 406 | 6.7% |
| 5 | 387 | 6.4% |
| 9 | 328 | 5.5% |
| 8 | 282 | 4.7% |
| Other values (25) | 1207 |
| Value | Count | Frequency (%) |
| 3 | 364 | |
| 1 | 311 | |
| 2 | 268 | |
| 7 | 207 | 7.2% |
| 6 | 206 | 7.2% |
| 0 | 204 | 7.1% |
| 5 | 195 | 6.8% |
| 4 | 188 | 6.5% |
| 8 | 144 | 5.0% |
| 9 | 137 | 4.8% |
| Other values (22) | 650 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 6015 |
| Value | Count | Frequency (%) |
| (unknown) | 2874 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 6015 |
| Value | Count | Frequency (%) |
| (unknown) | 2874 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 6015 |
| Value | Count | Frequency (%) |
| (unknown) | 2874 |
Fare
Real number (ℝ)
| | Train data report | Test data report |
|---|---|---|
| Distinct | 248 | 169 |
| Distinct (%) | 27.8% | 40.5% |
| Missing | 0 | 1 |
| Missing (%) | 0.0% | 0.2% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 32.204208 | 35.627188 |
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 15 | 2 |
| Zeros (%) | 1.7% | 0.5% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Quantile statistics
| | Train data report | Test data report |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.2292 |
| Q1 | 7.9104 | 7.8958 |
| median | 14.4542 | 14.4542 |
| Q3 | 31 | 31.5 |
| 95-th percentile | 112.07915 | 151.55 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 23.0896 | 23.6042 |
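The Range and Interquartile range rows follow directly from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal sketch of that arithmetic for Fare, using a hypothetical helper name:

```python
import math


def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Derive range and interquartile range from four quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Quantiles copied from the Fare table.
train_spread = spread(0, 7.9104, 31, 512.3292)
test_spread = spread(0, 7.8958, 31.5, 512.3292)

print(train_spread, test_spread)
```

Both datasets share the same minimum and maximum fare, so their ranges are identical, while the IQRs differ slightly.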
Descriptive statistics
| | Train data report | Test data report |
|---|---|---|
| Standard deviation | 49.693429 | 55.907576 |
| Coefficient of variation (CV) | 1.5430725 | 1.5692391 |
| Kurtosis | 33.398141 | 17.921595 |
| Mean | 32.204208 | 35.627188 |
| Median Absolute Deviation (MAD) | 6.9042 | 6.825 |
| Skewness | 4.7873165 | 3.6872133 |
| Sum | 28693.949 | 14856.538 |
| Variance | 2469.4368 | 3125.6571 |
| Monotonicity | Not monotonic | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 8.05 | 43 | 4.8% |
| 13 | 42 | 4.7% |
| 7.8958 | 38 | 4.3% |
| 7.75 | 34 | 3.8% |
| 26 | 31 | 3.5% |
| 10.5 | 24 | 2.7% |
| 7.925 | 18 | 2.0% |
| 7.775 | 16 | 1.8% |
| 7.2292 | 15 | 1.7% |
| 0 | 15 | 1.7% |
| Other values (238) | 615 |
| Value | Count | Frequency (%) |
| 7.75 | 21 | 5.0% |
| 26 | 19 | 4.5% |
| 8.05 | 17 | 4.1% |
| 13 | 17 | 4.1% |
| 10.5 | 11 | 2.6% |
| 7.8958 | 11 | 2.6% |
| 7.775 | 10 | 2.4% |
| 7.2292 | 9 | 2.2% |
| 7.225 | 9 | 2.2% |
| 7.8542 | 8 | 1.9% |
| Other values (159) | 285 |
| Value | Count | Frequency (%) |
| 0 | 15 | |
| 4.0125 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6.2375 | 1 | 0.1% |
| 6.4375 | 1 | 0.1% |
| 6.45 | 1 | 0.1% |
| 6.4958 | 2 | 0.2% |
| 6.75 | 2 | 0.2% |
| 6.8583 | 1 | 0.1% |
| 6.95 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 2 | 0.5% |
| 3.1708 | 1 | 0.2% |
| 6.4375 | 2 | 0.5% |
| 6.4958 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7 | 2 | 0.5% |
| 7.05 | 2 | 0.5% |
| 7.225 | 9 | |
| 7.2292 | 9 | |
| 7.25 | 5 |
Cabin
Text
| | Train data report | Test data report |
|---|---|---|
| Distinct | 147 | 76 |
| Distinct (%) | 72.1% | 83.5% |
| Missing | 687 | 327 |
| Missing (%) | 77.1% | 78.2% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.5882353 | 4.0769231 |
| Min length | 1 | 1 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 101 | 62 |
| Unique (%) | 49.5% | 68.1% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | C85 | B45 |
| 2nd row | C123 | E31 |
| 3rd row | E46 | B57 B59 B63 B66 |
| 4th row | G6 | B36 |
| 5th row | C103 | A21 |
| Value | Count | Frequency (%) |
| c23 | 4 | 1.7% |
| c27 | 4 | 1.7% |
| g6 | 4 | 1.7% |
| b96 | 4 | 1.7% |
| b98 | 4 | 1.7% |
| f | 4 | 1.7% |
| c25 | 4 | 1.7% |
| f33 | 3 | 1.3% |
| e101 | 3 | 1.3% |
| f2 | 3 | 1.3% |
| Other values (151) | 201 |
| Value | Count | Frequency (%) |
| f | 4 | 3.4% |
| b57 | 3 | 2.5% |
| b63 | 3 | 2.5% |
| b66 | 3 | 2.5% |
| b59 | 3 | 2.5% |
| c27 | 2 | 1.7% |
| e46 | 2 | 1.7% |
| c6 | 2 | 1.7% |
| c78 | 2 | 1.7% |
| b45 | 2 | 1.7% |
| Other values (80) | 92 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 72 | 9.8% |
| C | 71 | 9.7% |
| B | 64 | 8.7% |
| 1 | 61 | 8.3% |
| 3 | 59 | 8.1% |
| 6 | 51 | 7.0% |
| 5 | 45 | 6.1% |
| 4 | 37 | 5.1% |
| 8 | 37 | 5.1% |
| (space) | 34 | 4.6% |
| Other values (9) | 201 |
| Value | Count | Frequency (%) |
| C | 43 | |
| 5 | 34 | |
| 1 | 33 | 8.9% |
| B | 32 | 8.6% |
| 6 | 30 | 8.1% |
| 3 | 28 | 7.5% |
| (space) | 27 | 7.3% |
| 2 | 25 | 6.7% |
| 4 | 21 | 5.7% |
| 7 | 15 | 4.0% |
| Other values (8) | 83 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 732 |
| Value | Count | Frequency (%) |
| (unknown) | 371 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 732 |
| Value | Count | Frequency (%) |
| (unknown) | 371 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 732 |
| Value | Count | Frequency (%) |
| (unknown) | 371 |
Embarked
Categorical
| | Train data report | Test data report |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.3% | 0.7% |
| Missing | 2 | 0 |
| Missing (%) | 0.2% | 0.0% |
| Memory size | 7.1 KiB | 3.4 KiB |
Length
| | Train data report | Test data report |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Unique
| | Train data report | Test data report |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Train data report | Test data report |
|---|---|---|
| 1st row | S | Q |
| 2nd row | C | S |
| 3rd row | S | Q |
| 4th row | S | S |
| 5th row | S | S |
Common Values
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 644 | 72.3% |
| C | 168 | 18.9% |
| Q | 77 | 8.6% |
| (Missing) | 2 | 0.2% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 270 | 64.6% |
| C | 102 | 24.4% |
| Q | 46 | 11.0% |
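The Frequency (%) column in the Common Values tables appears to be each count divided by the total number of observations, with missing values included in the denominator (891 rows for the train data). The following is a minimal sketch of that arithmetic using the train counts from the table; the function name `frequency_percent` is hypothetical and is not part of ydata-profiling.

```python
# Counts copied from the train Common Values table for Embarked.
train_counts = {"S": 644, "C": 168, "Q": 77, "(Missing)": 2}


def frequency_percent(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


train_freq = frequency_percent(train_counts)
for value, pct in train_freq.items():
    print(f"{value}: {pct}%")
```

The character-level tables further down divide by the number of characters instead (889 for the train data, since the 2 missing values contribute none), which is why Q appears there as 8.7% rather than 8.6%.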
Length
Histogram of lengths of the category
Common Values (Plot)
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| s | 644 | 72.4% |
| c | 168 | 18.9% |
| q | 77 | 8.7% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| s | 270 | 64.6% |
| c | 102 | 24.4% |
| q | 46 | 11.0% |
Most occurring characters
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 644 | 72.4% |
| C | 168 | 18.9% |
| Q | 77 | 8.7% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 270 | 64.6% |
| C | 102 | 24.4% |
| Q | 46 | 11.0% |
Most occurring categories
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 889 | 100.0% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 418 | 100.0% |
Most frequent character per category
(unknown)
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 644 | 72.4% |
| C | 168 | 18.9% |
| Q | 77 | 8.7% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 270 | 64.6% |
| C | 102 | 24.4% |
| Q | 46 | 11.0% |
Most occurring scripts
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 889 | 100.0% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 418 | 100.0% |
Most frequent character per script
(unknown)
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 644 | 72.4% |
| C | 168 | 18.9% |
| Q | 77 | 8.7% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 270 | 64.6% |
| C | 102 | 24.4% |
| Q | 46 | 11.0% |
Most occurring blocks
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 889 | 100.0% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 418 | 100.0% |
Most frequent character per block
(unknown)
Train data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 644 | 72.4% |
| C | 168 | 18.9% |
| Q | 77 | 8.7% |
Test data report
| Value | Count | Frequency (%) |
|---|---|---|
| S | 270 | 64.6% |
| C | 102 | 24.4% |
| Q | 46 | 11.0% |
Interactions
Train data report
Test data report
Correlations
Train data report
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.065 | 0.135 | -0.254 | 0.041 | 0.269 | 0.099 | -0.182 | 0.155 |
| Embarked | 0.065 | 1.000 | 0.196 | 0.052 | 0.000 | 0.260 | 0.113 | 0.092 | 0.166 |
| Fare | 0.135 | 0.196 | 1.000 | 0.410 | -0.014 | 0.479 | 0.189 | 0.447 | 0.283 |
| Parch | -0.254 | 0.052 | 0.410 | 1.000 | 0.001 | 0.022 | 0.247 | 0.450 | 0.157 |
| PassengerId | 0.041 | 0.000 | -0.014 | 0.001 | 1.000 | 0.032 | 0.066 | -0.061 | 0.104 |
| Pclass | 0.269 | 0.260 | 0.479 | 0.022 | 0.032 | 1.000 | 0.130 | 0.148 | 0.337 |
| Sex | 0.099 | 0.113 | 0.189 | 0.247 | 0.066 | 0.130 | 1.000 | 0.206 | 0.540 |
| SibSp | -0.182 | 0.092 | 0.447 | 0.450 | -0.061 | 0.148 | 0.206 | 1.000 | 0.187 |
| Survived | 0.155 | 0.166 | 0.283 | 0.157 | 0.104 | 0.337 | 0.540 | 0.187 | 1.000 |
Test data report
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp |
|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.135 | 0.315 | -0.130 | -0.019 | 0.349 | 0.000 | -0.015 |
| Embarked | 0.135 | 1.000 | 0.240 | 0.113 | 0.060 | 0.308 | 0.109 | 0.101 |
| Fare | 0.315 | 0.240 | 1.000 | 0.378 | 0.020 | 0.475 | 0.154 | 0.441 |
| Parch | -0.130 | 0.113 | 0.378 | 1.000 | 0.051 | 0.000 | 0.213 | 0.412 |
| PassengerId | -0.019 | 0.060 | 0.020 | 0.051 | 1.000 | 0.054 | 0.000 | -0.010 |
| Pclass | 0.349 | 0.308 | 0.475 | 0.000 | 0.054 | 1.000 | 0.106 | 0.113 |
| Sex | 0.000 | 0.109 | 0.154 | 0.213 | 0.000 | 0.106 | 1.000 | 0.136 |
| SibSp | -0.015 | 0.101 | 0.441 | 0.412 | -0.010 | 0.113 | 0.136 | 1.000 |
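The tables above mix numeric and categorical columns, and this section does not state which measure produced each cell. For a pair of categorical columns such as Sex and Embarked, one common symmetric association measure is Cramér's V; whether the report used it here is an assumption. The sketch below computes the plain (uncorrected) statistic on hypothetical contingency tables that are not taken from the report.

```python
import math


def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by n and by the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical 2x2 tables, for illustration only.
perfect = cramers_v([[10, 0], [0, 10]])      # the two columns determine each other
independent = cramers_v([[5, 5], [5, 5]])    # no association at all
print(perfect, independent)
```

Cramér's V ranges from 0 to 1 and is never negative, so it cannot explain the negative entries in the tables (for example Age against Parch), which presumably come from a signed measure for numeric pairs.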
Missing values
Train data report
Test data report
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
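Nullity correlation, as described above, can be sketched by turning each column into a 0/1 "is present" indicator and taking the Pearson correlation of the indicators. This is a minimal illustration under that assumption, not the report's own implementation; the helper names and the toy columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(col_a, col_b):
    """Correlate the presence (1) or absence (0) of values in two columns.

    Each column must contain both present and missing values; a column that is
    fully present or fully missing has zero variance and no defined correlation.
    """
    present_a = [0 if value is None else 1 for value in col_a]
    present_b = [0 if value is None else 1 for value in col_b]
    return pearson(present_a, present_b)


# Hypothetical toy columns; None marks a missing cell.
age = [22.0, None, 26.0, None]
cabin = ["C85", None, "E46", None]
fare = [7.25, None, None, 8.05]

same_pattern = nullity_correlation(age, cabin)   # missing in exactly the same rows
unrelated = nullity_correlation(age, fare)       # missingness does not line up
print(same_pattern, unrelated)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.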
Sample
Train data report
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
| 5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q |
| 6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S |
| 7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S |
| 8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | NaN | S |
| 9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C |
Test data report
| | PassengerId | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 892 | 3 | Kelly, Mr. James | male | 34.5 | 0 | 0 | 330911 | 7.8292 | NaN | Q |
| 1 | 893 | 3 | Wilkes, Mrs. James (Ellen Needs) | female | 47.0 | 1 | 0 | 363272 | 7.0000 | NaN | S |
| 2 | 894 | 2 | Myles, Mr. Thomas Francis | male | 62.0 | 0 | 0 | 240276 | 9.6875 | NaN | Q |
| 3 | 895 | 3 | Wirz, Mr. Albert | male | 27.0 | 0 | 0 | 315154 | 8.6625 | NaN | S |
| 4 | 896 | 3 | Hirvonen, Mrs. Alexander (Helga E Lindqvist) | female | 22.0 | 1 | 1 | 3101298 | 12.2875 | NaN | S |
| 5 | 897 | 3 | Svensson, Mr. Johan Cervin | male | 14.0 | 0 | 0 | 7538 | 9.2250 | NaN | S |
| 6 | 898 | 3 | Connolly, Miss. Kate | female | 30.0 | 0 | 0 | 330972 | 7.6292 | NaN | Q |
| 7 | 899 | 2 | Caldwell, Mr. Albert Francis | male | 26.0 | 1 | 1 | 248738 | 29.0000 | NaN | S |
| 8 | 900 | 3 | Abrahim, Mrs. Joseph (Sophie Halaut Easu) | female | 18.0 | 0 | 0 | 2657 | 7.2292 | NaN | C |
| 9 | 901 | 3 | Davies, Mr. John Samuel | male | 21.0 | 2 | 0 | A/4 48871 | 24.1500 | NaN | S |
Train data report
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 881 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.0 | 0 | 0 | 349257 | 7.8958 | NaN | S |
| 882 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.0 | 0 | 0 | 7552 | 10.5167 | NaN | S |
| 883 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | NaN | S |
| 884 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | NaN | S |
| 885 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0 | 5 | 382652 | 29.1250 | NaN | Q |
| 886 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0 | 0 | 211536 | 13.0000 | NaN | S |
| 887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
| 888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S |
| 889 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0 | 0 | 111369 | 30.0000 | C148 | C |
| 890 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0 | 0 | 370376 | 7.7500 | NaN | Q |
Test data report
| | PassengerId | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 408 | 1300 | 3 | Riordan, Miss. Johanna Hannah"" | female | NaN | 0 | 0 | 334915 | 7.7208 | NaN | Q |
| 409 | 1301 | 3 | Peacock, Miss. Treasteall | female | 3.0 | 1 | 1 | SOTON/O.Q. 3101315 | 13.7750 | NaN | S |
| 410 | 1302 | 3 | Naughton, Miss. Hannah | female | NaN | 0 | 0 | 365237 | 7.7500 | NaN | Q |
| 411 | 1303 | 1 | Minahan, Mrs. William Edward (Lillian E Thorpe) | female | 37.0 | 1 | 0 | 19928 | 90.0000 | C78 | Q |
| 412 | 1304 | 3 | Henriksson, Miss. Jenny Lovisa | female | 28.0 | 0 | 0 | 347086 | 7.7750 | NaN | S |
| 413 | 1305 | 3 | Spector, Mr. Woolf | male | NaN | 0 | 0 | A.5. 3236 | 8.0500 | NaN | S |
| 414 | 1306 | 1 | Oliva y Ocana, Dona. Fermina | female | 39.0 | 0 | 0 | PC 17758 | 108.9000 | C105 | C |
| 415 | 1307 | 3 | Saether, Mr. Simon Sivertsen | male | 38.5 | 0 | 0 | SOTON/O.Q. 3101262 | 7.2500 | NaN | S |
| 416 | 1308 | 3 | Ware, Mr. Frederick | male | NaN | 0 | 0 | 359309 | 8.0500 | NaN | S |
| 417 | 1309 | 3 | Peter, Master. Michael J | male | NaN | 1 | 1 | 2668 | 22.3583 | NaN | C |
Duplicate rows
Train data report
Dataset does not contain duplicate rows.
Test data report
Dataset does not contain duplicate rows.
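A duplicate-row check of this kind can be sketched by counting how often each complete row occurs and keeping the rows seen more than once. The sketch assumes a duplicate means an exact match on every column; `duplicate_rows` is a hypothetical helper, and the rows are trimmed to three columns from the train sample above.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# (PassengerId, Name, Sex) for the first three train rows shown in the Sample section.
rows = [
    (1, "Braund, Mr. Owen Harris", "male"),
    (2, "Cumings, Mrs. John Bradley (Florence Briggs Thayer)", "female"),
    (3, "Heikkinen, Miss. Laina", "female"),
]

no_duplicates = duplicate_rows(rows)
# Appending a copy of the first row makes it a duplicate with a count of 2.
with_duplicate = duplicate_rows(rows + [rows[0]])
print(no_duplicates, with_duplicate)
```

Because PassengerId is unique in both datasets, no two full rows can match, which is consistent with the report finding no duplicates.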